between the k-mer patterns from multiple sequences was calculated in a different way. First, an F statistic was calculated using the following equation,

$$F = \sum_{\theta \in \Theta} \frac{\min\left(N_1(\theta),\, N_2(\theta)\right)}{\min(L_1, L_2) - k + 1} \tag{7.16}$$

where $\Theta$ was the set of all k-mers, $\theta$ was one k-mer, $N_1(\theta)$ and $N_2(\theta)$ were the frequencies of $\theta$ in the two sequences, and $L_1$ and $L_2$ denoted the lengths of the two sequences. The distance between the two sequences was then calculated as

$$d = \frac{\log(0.1 + F) - \log(1.1)}{\log(0.1)} \tag{7.17}$$
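To make equations (7.16) and (7.17) concrete, the following is a minimal base-R sketch of the calculation for two plain character strings. The function names count_kmers and kmer_distance are hypothetical helpers introduced only for illustration; they are not part of the kmer package, whose own implementation may differ in detail.

```r
# Count overlapping k-mers in a character string (base R only).
# count_kmers is a hypothetical helper, not a kmer package function.
count_kmers <- function(s, k) {
  stopifnot(nchar(s) >= k)
  starts <- seq_len(nchar(s) - k + 1)
  table(substring(s, starts, starts + k - 1))
}

# k-mer distance between two strings, following equations (7.16) and (7.17).
# kmer_distance is a hypothetical helper, not a kmer package function.
kmer_distance <- function(s1, s2, k) {
  n1 <- count_kmers(s1, k)
  n2 <- count_kmers(s2, k)
  # A k-mer missing from either sequence has min(N1, N2) = 0,
  # so only the shared k-mers need to be summed.
  shared <- intersect(names(n1), names(n2))
  f <- sum(pmin(as.vector(n1[shared]), as.vector(n2[shared]))) /
    (min(nchar(s1), nchar(s2)) - k + 1)
  (log(0.1 + f) - log(1.1)) / log(0.1)
}

d_same <- kmer_distance("ACGTACGT", "ACGTACGT", k = 3)  # F = 1, so d = 0
d_diff <- kmer_distance("AAAAAAAA", "CCCCCCCC", k = 3)  # F = 0, the largest d
```

For identical sequences F equals 1 and the distance is 0. When no k-mer is shared, F equals 0 and the distance reaches its maximum, which is slightly above 1 because of the log(1.1) term.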

The code shown below was used to apply the kmer package to analyse these sequences:

library(ape)
library(kmer)
library(insect)
library(Biostrings)

x <- readFASTA('SARS.HIV.fasta')

plot(cluster(x), horiz=TRUE)

Figure 7.12. The tree generated by the kmer package for 17 genome sequences.

Figure 7.12 shows the tree generated by kmer for these 17 genome sequences. It shows the same pattern as seen in Figure 7.10 and Figure